Distinguishing Computer-generated Graphics from Natural Images Based on Sensor Pattern Noise and Deep Learning
Computer-generated graphics (CGs) are images generated by computer software.
The rapid development of computer graphics technologies has made it easier to
generate photorealistic computer graphics, and these graphics are quite
difficult to distinguish from natural images (NIs) with the naked eye. In this
paper, we propose a method based on sensor pattern noise (SPN) and deep
learning to distinguish CGs from NIs. Before being fed into our convolutional
neural network (CNN)-based model, these images (both CGs and NIs) are clipped into
image patches. Furthermore, three high-pass filters (HPFs) are used to remove
low-frequency signals, which represent the image content. These filters are
also used to reveal the residual signal as well as SPN introduced by the
digital camera device. Different from the traditional methods of distinguishing
CGs from NIs, the proposed method utilizes a five-layer CNN to classify the
input image patches. Based on the classification results of the image patches,
we deploy a majority vote scheme to obtain the classification results for the
full-size images. The experiments have demonstrated that (1) the proposed
method with three HPFs can achieve better results than that with only one HPF
or no HPF and that (2) the proposed method with three HPFs achieves 100%
accuracy, even when the NIs undergo JPEG compression with a quality factor of
75.
Comment: This paper has been published by Sensors. doi:10.3390/s18041296;
Sensors 2018, 18(4), 129
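The patch-then-vote step described in this abstract can be illustrated with a minimal sketch. It assumes non-overlapping square patches and string labels `"CG"` and `"NI"`; the patch size, the helper names `clip_into_patches` and `majority_vote`, and the tie-breaking rule are illustrative assumptions, not details taken from the paper. The CNN that would produce the per-patch labels is not shown.

```python
from collections import Counter


def clip_into_patches(height, width, patch_size):
    """Return top-left (row, col) corners of non-overlapping square patches.

    Assumption: patches do not overlap and partial border patches are dropped.
    """
    return [
        (row, col)
        for row in range(0, height - patch_size + 1, patch_size)
        for col in range(0, width - patch_size + 1, patch_size)
    ]


def majority_vote(patch_labels):
    """Return the label predicted for the largest number of patches.

    Assumption: on a tie, the label that appears first in the list wins,
    which is how Counter.most_common orders equal counts.
    """
    if not patch_labels:
        raise ValueError("at least one patch label is required")
    return Counter(patch_labels).most_common(1)[0][0]


# Hypothetical per-patch predictions for one full-size image.
corners = clip_into_patches(height=4, width=6, patch_size=2)
predictions = ["CG", "NI", "CG", "CG", "NI", "CG"]
image_label = majority_vote(predictions)
```

Voting over many patches lets a few misclassified patches be outvoted, which is one plausible reason a patch-level classifier can yield a more stable full-image decision.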
Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior
works have shown that using Computer Vision (CV) techniques to identify smoke
as visual evidence can influence the attitude of regulators and empower
citizens to pursue environmental justice. However, existing datasets are not of
sufficient quality nor quantity to train the robust CV models needed to support
air quality advocacy. We introduce RISE, the first large-scale video dataset
for Recognizing Industrial Smoke Emissions. We adopted a citizen science
approach to collaborate with local community members to annotate whether a
video clip has smoke emissions. Our dataset contains 12,567 clips from 19
distinct views from cameras that monitored three industrial facilities. These
daytime clips span 30 days over two years, including all four seasons. We ran
experiments using deep neural networks to establish a strong performance
baseline and reveal smoke recognition challenges. Our survey study discussed
community feedback, and our data analysis displayed opportunities for
integrating citizen scientists and crowd workers into the application of
Artificial Intelligence for social good.
Comment: Technical repor
Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation
Human face images usually appear at a wide range of visual scales. Existing
face representations handle scale variation via a multi-scale scheme that
assembles a finite series of predefined scales. Such a multi-shot scheme adds
inference burden, and the predefined scales inevitably deviate from real data.
Instead, learning scale parameters from data, and using them for one-shot
feature inference, is a reasonable alternative. To this end, we reform the conv
layer based on scale-space theory and achieve two benefits: 1) the conv layer
learns a set of scales from the real data distribution, each of which is
realized by a conv kernel; 2) the layer automatically highlights the feature at
the proper channel and location corresponding to the input pattern scale and
its presence. We then achieve hierarchical scale attention by stacking the
reformed layers, building a novel architecture named SCale AttentioN Conv
Neural Network (SCAN-CNN). We apply SCAN-CNN to the face recognition task and
advance the state-of-the-art (SOTA) performance. The accuracy gain is more
evident when the face images are blurry. Meanwhile, as a single-shot scheme,
its inference is more efficient than multi-shot fusion. A set of tools is
provided to ensure fast training of SCAN-CNN and zero increase in inference
cost compared with the plain CNN.
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis
Adapting generic speech recognition models to specific individuals is a
challenging problem due to the scarcity of personalized data. Recent works have
proposed boosting the amount of training data using personalized text-to-speech
synthesis. Here, we ask two fundamental questions about this strategy: when is
synthetic data effective for personalization, and why is it effective in those
cases? To address the first question, we adapt a state-of-the-art automatic
speech recognition (ASR) model to target speakers from four benchmark datasets
representative of different speaker types. We show that ASR personalization
with synthetic data is effective in all cases, but particularly when (i) the
target speaker is underrepresented in the global data, and (ii) the capacity of
the global model is limited. To address the second question of why personalized
synthetic data is effective, we use controllable speech synthesis to generate
speech with varied styles and content. Surprisingly, we find that the text
content of the synthetic data, rather than style, is important for speaker
adaptation. These results lead us to propose a data selection strategy for ASR
personalization based on speech content.
Comment: ICASSP 202
Deep Time-Stream Framework for Click-Through Rate Prediction by Tracking Interest Evolution
Click-through rate (CTR) prediction is an essential task in industrial
applications such as video recommendation. Recently, deep learning models have
been proposed to learn the representation of users' overall interests, while
ignoring the fact that interests may dynamically change over time. We argue
that it is necessary to consider the continuous-time information in CTR models
to track trends in user interest from rich historical behaviors. In this paper,
we propose a novel Deep Time-Stream framework (DTS) which introduces time
information via an ordinary differential equation (ODE). DTS continuously
models the evolution of interests using a neural network, and thus is able to
tackle the challenge of dynamically representing users' interests based on
their historical behaviors. In addition, our framework can be seamlessly
applied to any existing deep CTR models by leveraging the additional
Time-Stream Module, while no changes are made to the original CTR models.
Experiments on a public dataset as well as a real industry dataset with
billions of samples demonstrate the effectiveness of the proposed approaches,
which achieve superior performance compared with existing methods.
Comment: 8 pages. arXiv admin note: text overlap with arXiv:1809.03672 by
other author
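The idea of evolving an interest state continuously in time through an ODE can be sketched as follows. This is a generic explicit-Euler integrator, not the DTS implementation: the function names, the step count, and the toy dynamics (a simple decay standing in for a learned neural network) are all assumptions made for illustration.

```python
import math


def euler_integrate(dynamics, state, t_start, t_end, steps=100):
    """Integrate d(state)/dt = dynamics(state, t) with explicit Euler steps.

    `state` is a list of floats standing in for a user-interest vector.
    """
    step = (t_end - t_start) / steps
    current = list(state)
    t = t_start
    for _ in range(steps):
        derivative = dynamics(current, t)
        current = [value + step * slope for value, slope in zip(current, derivative)]
        t += step
    return current


def toy_dynamics(state, t):
    """Hypothetical stand-in for a learned network: interest decays toward zero."""
    return [-value for value in state]


# Evolve a one-dimensional interest state across the gap between two
# user behaviors that are one time unit apart.
evolved = euler_integrate(toy_dynamics, [1.0], t_start=0.0, t_end=1.0, steps=1000)
```

Because the state is defined at every time point, irregular gaps between user behaviors can be handled by integrating over the actual elapsed time rather than assuming evenly spaced steps.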
Critical Roles of microRNA-141-3p and CHD8 in Hypoxia/Reoxygenation-Induced Cardiomyocyte Apoptosis
Background: Cardiovascular diseases are currently the leading cause of death in humans. The high mortality of cardiac diseases is associated with myocardial ischemia and reperfusion (I/R). Recent studies have reported that microRNAs (miRNAs) play important roles in cell apoptosis. However, it is not yet known whether miR-141-3p contributes to the regulation of cardiomyocyte apoptosis. It is well established that the in vitro hypoxia/reoxygenation (H/R) model can mimic in vivo myocardial I/R injury. This study aimed to investigate the effects of miR-141-3p and CHD8 on cardiomyocyte apoptosis following H/R. Results: We found that H/R remarkably reduces the expression of miR-141-3p but enhances CHD8 expression at both the mRNA and protein levels in H9c2 cardiomyocytes. We also found that either overexpression of miR-141-3p by transfection of miR-141-3p mimics or inhibition of CHD8 by transfection of small interfering RNA (siRNA) significantly decreases cardiomyocyte apoptosis induced by H/R. Moreover, miR-141-3p interacts with CHD8. Furthermore, miR-141-3p and CHD8 reduce the expression of p21. Conclusion: MiR-141-3p and CHD8 play critical roles in cardiomyocyte apoptosis induced by H/R. These studies suggest that miR-141-3p- and CHD8-mediated cardiomyocyte apoptosis may offer a novel therapeutic strategy against myocardial I/R injury-induced cardiovascular diseases.
Magnetic dilution effect and topological phase transitions in (Mn$_{1-x}$Pb$_x$)Bi$_2$Te$_4$
As the first intrinsic antiferromagnetic (AFM) topological insulator (TI),
MnBi$_2$Te$_4$ has provided a material platform to realize various emergent
phenomena arising from the interplay of magnetism and band topology. Here, by
investigating (Mn$_{1-x}$Pb$_x$)Bi$_2$Te$_4$ single crystals via x-ray,
electrical transport, magnetometry and neutron measurements, chemical analysis,
external pressure, and first-principles calculations, we reveal the magnetic
dilution effect on the magnetism and band topology in MnBi$_2$Te$_4$. With
increasing $x$, both lattice parameters $a$ and $c$ expand linearly by around
2%. All samples undergo the paramagnetic to A-type antiferromagnetic
transition, with the Néel temperature decreasing linearly from 24 K to 2 K as
$x$ increases. Our neutron data refinement of one sample indicates that the
ordered moment is 4.3(1) $\mu_B$/Mn at 4.85 K and the amount of the Mn
antisites is negligible within the error bars. Isothermal magnetization data
reveal a slight decrease of the interlayer plane-plane antiferromagnetic
exchange interaction and a monotonic decrease of the magnetic anisotropy, due
to diluting magnetic ions and enlarging the unit cell. For one composition, the
application of external pressure enhances the interlayer antiferromagnetic
coupling, boosting the Néel temperature at a rate of 1.4 K/GPa and the
saturation field at a rate of 1.8 T/GPa. Furthermore, our first-principles
calculations reveal that the band inversion in the two end materials,
MnBi$_2$Te$_4$ and PbBi$_2$Te$_4$, occurs at different points in the Brillouin
zone, while two gapless points appear at $x$ = 0.44 and 0.66, suggesting
possible topological phase transitions with doping.
Comment: 10 pages, 7 figure